Consider the time to blindness of both eyes in diabetic patients, where one eye is randomly assigned to laser treatment and the other eye serves as a control. Let \(T_1\) and \(T_2\) denote the true failure times (in months) of the treated and untreated eye in the diabeteic retinopathy dataset. Our goal here is to estimate the conditional survival of the treated eye (\(T_1\)) conditional on either failure or survival of the untreated eye (\(T_2\)). Specifically, we estimate both \(P(T_1>60 \mid T_2\leq 36, Z)\) and \(P(T_1>60 \mid T_2> 36, Z)\) using the bivariate pseudo-observations approach. That is, we are interested in the conditional probability that the treated eye survives more than five years, given that the untreated eye has either failed or survived during the first three years, given the covariates.
For this purpose, we use our bivariate pseudo-observations approach based on the Dabrowska estimator, and the logit link function, to estimate the covariate-adjusted bivariate survival probabilities at the three time points \(\{(60,36), (60,0), (0,36)\}\). Consequently, we obtain estimates of the two conditional probabilities \(S_{T_1| T_2\leq 36}(60\mid Z)\) and \(S_{T_1| T_2>36}(60\mid Z)\) using the relationships: \[\begin{align*} S_{T_1| T_2\leq 36}(60\mid Z)&=P(T_1>60 \mid T_2\leq 36, Z)=1-\frac{S(60,36 \mid Z)-S(60,0 \mid Z)-S(0,36 \mid Z)+1}{1-S(0,36 \mid Z)}, \\ S_{T_1| T_2>36}(60\mid Z)&=P(T_1>60 \mid T_2> 36, Z)=\frac{S(60,36 \mid Z)}{S(0,36 \mid Z)}. \end{align*}\]
library(tidyr)
library(plotly)
source("pseudo_observations_functions.R")
Read the dataset (in wide format).
obs <- read.csv("retinopathy_data.csv")
head(obs)
## id obs1 obs2 delta1 delta2 age mean_risk type
## 1 5 46.23 46.23 0 0 28 9.0 adult
## 2 14 42.50 31.30 0 1 12 7.0 juvenile
## 3 16 42.27 42.27 0 0 9 11.0 juvenile
## 4 25 20.60 20.60 0 0 9 11.0 juvenile
## 5 29 38.77 0.30 0 1 13 9.5 juvenile
## 6 46 65.23 54.27 0 1 12 9.0 juvenile
Note that obs1 and obs2 corresponds to the observed times (in months) of the treated and untreated eye, respectively.
t0 <- data.frame(t1=c(60,60,0),t2=c(36,0,36))
t0_merged <- paste0(t0[,1],",",t0[,2], sep = "")
obs <- PO_func_dabrowska_mult_new(obs,t0)
head(obs)
## id obs1 obs2 delta1 delta2 age mean_risk type PO_t1 PO_t2
## 1 1 46.23 46.23 0 0 28 9.0 adult 1.001560e+00 1.0072337
## 2 2 42.50 31.30 0 1 12 7.0 juvenile -8.221699e-02 1.0072337
## 3 3 42.27 42.27 0 0 9 11.0 juvenile 9.780348e-01 0.9923951
## 4 4 20.60 20.60 0 0 9 11.0 juvenile 7.309823e-01 0.8642561
## 5 5 38.77 0.30 0 1 13 9.5 juvenile 1.705303e-13 0.9672061
## 6 6 65.23 54.27 0 1 12 9.0 juvenile 1.075136e+00 1.0541096
## PO_t3
## 1 1.024552e+00
## 2 -9.364188e-02
## 3 1.024552e+00
## 4 8.367084e-01
## 5 1.421085e-14
## 6 1.024552e+00
n <- nrow(obs)
obs_long <- pivot_longer(obs,cols = starts_with("PO_t"))
obs_long$times <- rep(t0_merged,n)
head(obs_long,n = 9)
## # A tibble: 9 × 11
## id obs1 obs2 delta1 delta2 age mean_risk type name value times
## <int> <dbl> <dbl> <int> <int> <int> <dbl> <chr> <chr> <dbl> <chr>
## 1 1 46.2 46.2 0 0 28 9 adult PO_t1 1.00 60,36
## 2 1 46.2 46.2 0 0 28 9 adult PO_t2 1.01 60,0
## 3 1 46.2 46.2 0 0 28 9 adult PO_t3 1.02 0,36
## 4 2 42.5 31.3 0 1 12 7 juvenile PO_t1 -0.0822 60,36
## 5 2 42.5 31.3 0 1 12 7 juvenile PO_t2 1.01 60,0
## 6 2 42.5 31.3 0 1 12 7 juvenile PO_t3 -0.0936 0,36
## 7 3 42.3 42.3 0 0 9 11 juvenile PO_t1 0.978 60,36
## 8 3 42.3 42.3 0 0 9 11 juvenile PO_t2 0.992 60,0
## 9 3 42.3 42.3 0 0 9 11 juvenile PO_t3 1.02 0,36
set.seed(123)
fit <- geese(value~factor(times, levels=t0_merged)+age+mean_risk+type,
id=id, data=obs_long,scale.fix=TRUE,family=gaussian,
jack=TRUE, mean.link="logit",corstr="independence",)
betas <- fit$beta
summary(fit)
##
## Call:
## geese(formula = value ~ factor(times, levels = t0_merged) + age +
## mean_risk + type, id = id, data = obs_long, family = gaussian,
## mean.link = "logit", scale.fix = TRUE, corstr = "independence",
## jack = TRUE)
##
## Mean Model:
## Mean Link: logit
## Variance to Mean Relation: gaussian
##
## Coefficients:
## estimate san.se ajs.se
## (Intercept) 1.657976764 1.2772363 1.29241291
## factor(times, levels = t0_merged)60,0 1.133480665 0.1467074 0.14483368
## factor(times, levels = t0_merged)0,36 0.508826212 0.1071158 0.10573533
## age -0.006300588 0.0159417 0.01617585
## mean_risk -0.177551413 0.1112872 0.11270522
## typejuvenile -0.126652196 0.4917446 0.49597162
## wald p
## (Intercept) 1.68505619 1.942540e-01
## factor(times, levels = t0_merged)60,0 59.69310389 1.110223e-14
## factor(times, levels = t0_merged)0,36 22.56482185 2.031709e-06
## age 0.15620437 6.926754e-01
## mean_risk 2.54541236 1.106150e-01
## typejuvenile 0.06633553 7.967489e-01
##
## Scale is fixed.
##
## Correlation Model:
## Correlation Structure: independence
##
## Returned Error Value: 0
## Number of clusters: 197 Maximum cluster size: 3
des_mat <- model.matrix(value~factor(times, levels=t0_merged)+age+mean_risk+type, data = obs_long)
pred_values <- data.frame('Estimated survival'=exp(des_mat%*%betas)/(1+exp(des_mat%*%betas)),type=obs_long$type, Z=obs_long$age, X=obs_long$mean_risk, time=obs_long$times)
pred_1 <- pred_values[pred_values$time=="60,36",]
pred_2 <- pred_values[pred_values$time=="60,0",]
pred_3 <- pred_values[pred_values$time=="0,36",]
cond_prob1 <- pred_1
cond_prob1$Estimated.survival <- 1-(pred_1$Estimated.survival-pred_2$Estimated.survival-pred_3$Estimated.survival+1)/(1-pred_3$Estimated.survival)
cond_prob1 <- cond_prob1[order(cond_prob1$Z,cond_prob1$X),]
p <- plot_ly(x=cond_prob1$Z, y=cond_prob1$X, z=cond_prob1$Estimated.survival, type="scatter3d", mode="markers",color = cond_prob1$Estimated.survival)
p <- layout(p, scene = list(xaxis = list(title = "Age at diagnosis", range = c(0,60)), yaxis = list(title = "Risk score", range = c(6,12)), zaxis = list(title = "P(T1>5|T2<3)",range = c(0,1))))
p
As can be seen in this interactive figure, the covariate-adjusted conditional probability \(P(T_1>60 \mid T_2\leq 36, Z)\) ranges between 0.5 and 0.77, meaning that even if the untreated eye failed during the first three years, the treated eye still has a relatively high chance to survive more than five years. Additionally, this conditional survival probability of the treated eye is highest as the risk score is lowest, and it decreases as the risk score increases. Age at diagnosis of diabetes seems to also have some effect on this conditional survival probability, which decreases as age at diagnosis increases.
cond_prob2 <- pred_1
cond_prob2$Estimated.survival <- pred_1$Estimated.survival/pred_3$Estimated.survival
cond_prob2 <- cond_prob2[order(cond_prob2$Z,cond_prob2$X),]
p <- plot_ly(x=cond_prob2$Z, y=cond_prob2$X, z=cond_prob2$Estimated.survival, type="scatter3d", mode="markers",color = cond_prob2$Estimated.survival)
p <- layout(p, scene = list(xaxis = list(title = "Age at diagnosis", range = c(0,60)), yaxis = list(title = "Risk score", range = c(6,12)), zaxis = list(title = "P(T1>5|T2>3)",range = c(0,1))))
p
As can be seen in this interactive figure, the covariate-adjusted conditional probability \(P(T_1>60 \mid T_2> 36, Z)\) ranges between 0.73 and 0.84, meaning that when the untreated eye survives for more than three years, the treated eye has a very high chance to survive more than five years. Interestingly, this conditional survival probability remains high even when the age at diagnosis increases, and is almost not affected by the age at diagnosis. However, this conditional probability does decrease as the risk score increases, yet this effect is less substantial than before (when compared to the case where \(T_2\leq 36\)).
In summary, when the untreated eye survives for more than three years, the probability that the treated eye will survive more than five years is very high and is almost not affected by the covariates. However, when the untreated eye fails during the first three years, the probability that the treated eye will survive more than five years is highest when the risk score is lowest and when age at diagnosis is lowest, and it decreases both as a function of age at diagnosis and risk score.